Monte Carlo Semantics: Robust Inference and Logical Pattern Processing with Natural Language Text
Abstract
This thesis develops theory and computational techniques that allow a computer to analyze short pieces of text (e.g. ‘Socrates is a man and every man is mortal.’) and, on the basis of such an analysis, to decide yes/no questions about the text (‘Is Socrates mortal?’). More particularly, the problem is seen as a logical inference task. The computer must decide whether or not a logical consequence relation ‘therefore’ holds between the two pieces of text. (‘Socrates is a man and every man is mortal, therefore Socrates is mortal.’) This problem is a pervasive theme in logic and semantics but has also been subject over the last five years to a wave of renewed attention in computational linguistics sparked by the Recognizing Textual Entailment (RTE) challenge. A critical reevaluation of this line of work is presented here which demonstrates several problems concerning the empirical methodology used at RTE and the results derived from it. This thesis is thus more theory-driven, but nevertheless inspired by RTE in that it addresses problems raised by RTE which have not previously received sufficient attention from a theoretical viewpoint, such as the problem of robustness. With this goal in mind, two of the results on Natural Language Reasoning (NLR) established here become particularly important: (1) Assuming the syllogism as a benchmark fragment of NLR, the model theory which underlies NLR is not necessarily a two-valued logic, but it can be the many-valued Łukasiewicz logic. (2) Despite the fact that the syllogism is a logical language of less expressive power than natural language as a whole, a good approximation to NLR can still be obtained by using the method outlined here for rewriting natural language text into syllogistic premises.
These two properties of NLR enable the approach to robust inference and logical pattern processing called Monte Carlo semantics, which, in turn, demonstrates that a single logically based theory can account for the semantic informativity of deep techniques using theorem proving and for the robustness of bag-of-words shallow inference.
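The combination of a many-valued logic with sampling can be illustrated with a small sketch. This is not the thesis's algorithm. It only assumes the standard Łukasiewicz connectives (strong conjunction max(0, a + b - 1) and implication min(1, 1 - a + b)) and draws truth degrees uniformly at random, which is an arbitrary choice made for illustration. The names `luk_and`, `luk_implies`, `valid_pattern`, `invalid_pattern` and `estimate` are hypothetical.

```python
import random


def luk_and(a, b):
    """Łukasiewicz strong conjunction on truth degrees in [0, 1]."""
    return max(0.0, a + b - 1.0)


def luk_implies(a, b):
    """Łukasiewicz implication on truth degrees in [0, 1]."""
    return min(1.0, 1.0 - a + b)


def valid_pattern(man, mortal):
    """'Socrates is a man and every man is mortal, therefore Socrates is mortal.'

    'Every man is mortal' is instantiated at Socrates only (a simplification).
    """
    rule = luk_implies(man, mortal)
    return luk_implies(luk_and(man, rule), mortal)


def invalid_pattern(man, mortal):
    """'Socrates is mortal and every man is mortal, therefore Socrates is a man.'"""
    rule = luk_implies(man, mortal)
    return luk_implies(luk_and(mortal, rule), man)


def estimate(pattern, n_samples=10000, seed=0):
    """Average degree of 'premises therefore conclusion' over random valuations.

    Assumption: truth degrees are sampled independently and uniformly from [0, 1].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += pattern(rng.random(), rng.random())
    return total / n_samples


if __name__ == "__main__":
    print("valid pattern:  ", estimate(valid_pattern))
    print("invalid pattern:", estimate(invalid_pattern))
```

Under these assumptions the valid pattern receives degree 1 (up to floating-point rounding) for every sampled valuation, while the invalid pattern averages noticeably below 1, so the sampled score separates the two without a theorem prover.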
Similar resources
Monte Carlo Semantics: McPIET at RTE4
The Monte Carlo Pseudo Inference Engine for Text (McPIET) addresses the RTE problem within a new theoretic framework for robust inference and logical pattern processing based on integrated deep and shallow semantics. In this report we outline, in some detail, this new theoretic framework, and we will use it to shed some light on the informativity and robustness characteristics for the extreme c...
Monte Carlo Semantics: Robust Inference and Logical Pattern Processing Based on Integrated Deep and Shallow Semantic Representations
This document was submitted to the University of Cambridge Computer Laboratory as part of the documentation required by first year PhD candidates comprising a thesis proposal (Bergmair, 2007a) and a first year report (this document). In addition, a thesis draft (Bergmair, 2007b) has been submitted to supplement the required material. – For readers other than the examiners of this PhD project, i...
Robust Semantics for Semantic Parsing
The paper presents a robust semantics for NLP applications including QA, text entailment and SMT that combines a (fairly) standard treatment of logical operators such as negation and quantification (Steedman 2012) with a highly nonstandard paraphrase- and entailment-based semantics of relational terms derived from text data by machine reading (Lewis and Steedman 2013a; 2013b). I'll consider the ...
A New Text-Mining Method for Extracting User Context Information to Improve the Ranking of Search Engine Results
Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual and documentary material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo and Bing need to read a great many web documents and retrieve the most similar ones to the user ...
Sampling First Order Logical Particles
Approximate inference in dynamic systems is the problem of estimating the state of the system given a sequence of actions and partial observations. High precision estimation is fundamental in many applications like diagnosis, natural language processing, tracking, planning, and robotics. In this paper we present an algorithm that samples possible deterministic executions of a probabilistic sequ...